67 research outputs found

    Improving the Caenorhabditis elegans Genome Annotation Using Machine Learning

    For modern biology, precise genome annotations are of prime importance, as they allow the accurate definition of genic regions. We employ state-of-the-art machine learning methods to assay and improve the accuracy of the genome annotation of the nematode Caenorhabditis elegans. The proposed machine learning system is trained to recognize exons and introns on the unspliced mRNA, utilizing recent advances in support vector machines and label sequence learning. In 87% (coding and untranslated regions) and 95% (coding regions only) of all genes tested in several out-of-sample evaluations, our method correctly identified all exons and introns. Notably, only 37% and 50%, respectively, of the presently unconfirmed genes in the C. elegans genome annotation agree with our predictions; we therefore hypothesize that a sizable fraction of those genes are not correctly annotated. A retrospective evaluation of the Wormbase WS120 annotation [1] of C. elegans reveals that splice form predictions on unconfirmed genes in WS120 are inaccurate in about 18% of the considered cases, while our predictions deviate from the truth in only 10%–13%. We experimentally analyzed 20 controversial genes on which our system and the annotation disagree, confirming the superiority of our predictions. While our method correctly predicted 75% of those cases, the standard annotation was never completely correct. The accuracy of our system is further corroborated by a comparison with two other recently proposed systems that can be used for splice form prediction: SNAP and ExonHunter. We conclude that the genome annotation of C. elegans and other organisms can be greatly enhanced using modern machine learning technology.

    The importance of being regular: Caenorhabditis elegans and Pristionchus pacificus defecation mutants are hypersusceptible to bacterial pathogens

    Bacterial pathogens have shaped the evolution and survival of organisms throughout history, but little is known about the evolution of virulence mechanisms and the counteracting defence strategies of host species. The nematode model organisms Caenorhabditis elegans and Pristionchus pacificus feed on a wealth of bacteria in their natural soil environment, some of which can cause mortality. Previously, we have shown that these nematodes differ in their susceptibility to a range of human and insect pathogenic bacteria, with P. pacificus showing extreme resistance compared with C. elegans. Here, we isolated 400 strains of Bacillus from soil samples and fed their spores to both nematodes. Spores of six Bacillus strains were found to kill C. elegans but not P. pacificus. While the majority of Bacillus strains are benign to nematodes, observed pathogenicity is restricted to either the spore or the vegetative stage. We used the rapid C. elegans killer strain (Bacillus sp. 142) to conduct a screen for hypersusceptible P. pacificus mutants. Two P. pacificus mutants with severe muscle defects and an extended defecation cycle that die rapidly on Bacillus spores were isolated. The affected genes were identified as homologs of C. elegans unc-22 and unc-13. To test whether a similar relationship between defecation and bacterial pathogenesis exists in C. elegans, we used five known defecation mutants. Quantification of the defecation cycle in mutants also revealed a severe effect on survival in C. elegans. Thus, intestinal peristalsis is critical to nematode health and contributes significantly to survival when fed Gram-positive bacteria.

    Data from: First insights into the nature and evolution of antisense transcription in nematodes

    Background: The development of multicellular organisms is coordinated by various gene regulatory mechanisms that ensure correct spatio-temporal patterns of gene expression. Recently, the role of antisense transcription in gene regulation has moved into the focus of research. To characterize genome-wide patterns of antisense transcription and to study their evolutionary conservation, we sequenced a strand-specific RNA-seq library of the nematode Pristionchus pacificus. Results: We identified 1112 antisense configurations of which the largest group represents 465 antisense transcripts (ASTs) that are fully embedded in introns of their host genes. We find that most ASTs show homology to protein-coding genes and are overrepresented in proteomic data. Together with the finding that expression levels of ASTs and host genes are uncorrelated, this indicates that most ASTs in P. pacificus do not represent non-coding RNAs and do not exhibit regulatory functions on their host genes. We studied the evolution of antisense gene pairs across 20 nematode genomes, showing that the majority of pairs are lineage-specific and even the highly conserved vps-4, ddx-27, and sel-2 loci show abundant structural changes including duplications, deletions, intron gains and loss of antisense transcription. In contrast, host genes in general are remarkably conserved and encode exceptionally long introns leading to unusually large blocks of conserved synteny. Conclusions: Our study has shown that in P. pacificus antisense transcription as such does not define non-coding RNAs but is rather a feature of highly conserved genes with long introns. We hypothesize that the presence of regulatory elements imposes evolutionary constraint on the intron length, but simultaneously, their large size makes them a likely target for translocation of genomic elements including protein-coding genes that eventually end up as ASTs.

    Tandem-Repeat Patterns and Mutation Rates in Microsatellites of the Nematode Model Organism Pristionchus pacificus

    Modern evolutionary biology requires integrative approaches that combine life history, population structure, ecology, and development. The nematode Pristionchus pacificus has been established as a model system in which these aspects can be studied in one organism. P. pacificus has well-developed genetic, genomic, and transgenic tools and its ecological association with scarab beetles is well described. A recent study provided first mutation rate estimates based on mitochondrial genome sequencing and mutation accumulation line experiments that help resolve rather ancient evolutionary branches. Here, we analyzed the tandem-repeat pattern and studied spontaneous mutation rates for microsatellite markers by using the previously generated mutation accumulation lines. We found that 0.59%–3.83% of the genome is composed of short tandem repeats. We developed 41 microsatellite markers, randomly chosen throughout the genome, and analyzed them in 82 mutation accumulation lines after 142 generations. A total of 31 mutations were identified in these lines. There was a strong correlation between allele size and mutation rate in P. pacificus, similar to Caenorhabditis elegans. In contrast to C. elegans, however, there is no evidence for a bias toward multistep mutations. The mutation spectrum of microsatellite loci in P. pacificus shows more insertions than deletions, indicating a tendency toward lengthening, a process that might have contributed to the increase in genome size. The mutation rates obtained for individual microsatellite markers provide guidelines for divergence time estimates that can be applied in P. pacificus next-generation sequencing approaches of wild isolates.
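
    As a rough illustration of how the counts quoted in this abstract relate to a mutation rate, the sketch below computes a naive pooled estimate. It assumes that every one of the 41 markers was scored in every one of the 82 lines for all 142 generations, which the abstract does not state; the function name pooled_mutation_rate is hypothetical, and the per-marker rates reported in the paper itself may differ from this pooled figure.

```python
# Naive pooled estimate of a per-locus, per-generation mutation rate.
# The four counts come from the abstract; pooling them this way is an
# assumption made for illustration, not the authors' stated method.

def pooled_mutation_rate(mutations, markers, lines, generations):
    """Return mutations per locus per generation, assuming every marker
    was scored in every line for the full number of generations."""
    opportunities = markers * lines * generations
    if opportunities <= 0:
        raise ValueError("markers, lines and generations must be positive")
    return mutations / opportunities

rate = pooled_mutation_rate(mutations=31, markers=41, lines=82, generations=142)
print(f"{rate:.2e} mutations per locus per generation")  # prints 6.49e-05 ...
```

    Markers that failed to amplify in some lines would shrink the denominator, so this figure is best read as an order-of-magnitude check rather than a reported result.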

    Divergent combinations of cis-regulatory elements control the evolution of phenotypic plasticity

    The widespread occurrence of phenotypic plasticity across all domains of life demonstrates its evolutionary significance. However, how plasticity itself evolves and how it contributes to evolution are poorly understood. Here, we investigate the predatory nematode Pristionchus pacificus with its feeding structure plasticity using recombinant-inbred-line and quantitative-trait-locus (QTL) analyses between natural isolates. We show that a single QTL at a core developmental gene controls the expression of the cannibalistic morph. This QTL is composed of several cis-regulatory elements. Through CRISPR/Cas9 engineering, we identify copy number variation of potential transcription factor binding sites that interacts with a single intronic nucleotide polymorphism. Another intronic element eliminates gene expression altogether, mimicking knockouts of the locus. Comparisons of additional isolates further support the rapid evolution of these cis-regulatory elements. Finally, an independent QTL study reveals evidence for parallel evolution at the same locus. Thus, combinations of cis-regulatory elements shape plastic trait expression and control nematode cannibalism.

    Improving Transgenesis Efficiency and CRISPR-Associated Tools Through Codon Optimization and Native Intron Addition in Pristionchus Nematodes

    A lack of appropriate molecular tools is one obstacle that prevents in-depth mechanistic studies in many organisms. Transgenesis, clustered regularly interspaced short palindromic repeats (CRISPR)-associated engineering, and related tools are fundamental in the modern life sciences, but their applications are still limited to a few model organisms. In the phylum Nematoda, transgenesis can only be performed in a handful of species other than Caenorhabditis elegans, and these other species additionally suffer from significantly lower transgenesis efficiencies. We hypothesized that this may in part be due to incompatibilities of transgenes in the recipient organisms. Therefore, we investigated the genomic features of 10 nematode species from three of the major clades representing all different lifestyles. We found that these species show drastically different codon usage bias and intron composition. With these findings, we used the species Pristionchus pacificus as a proof of concept for codon optimization and native intron addition. Indeed, we were able to significantly improve transgenesis efficiency, a principle that may be usable in other nematode species. In addition, with the improved transgenes, we developed a fluorescent co-injection marker in P. pacificus for the detection of CRISPR-edited individuals, which helps considerably to reduce associated time and costs.

    Comparative genomics and community curation further improve gene annotations in the nematode Pristionchus pacificus

    Background: Nematode model organisms such as Caenorhabditis elegans and Pristionchus pacificus are powerful systems for studying the evolution of gene function at a mechanistic level. However, the identification of P. pacificus orthologs of candidate genes known from C. elegans is complicated by the discrepancy in the quality of gene annotations, a common problem in nematode and invertebrate genomics. Results: Here, we combine comparative genomic screens for suspicious gene models with community-based curation to further improve the quality of gene annotations in P. pacificus. We extend previous curations of one-to-one orthologs to larger gene families and also orphan genes. Cross-species comparisons of protein lengths, screens for atypical domain combinations and species-specific orphan genes resulted in 4311 candidate genes that were subject to community-based curation. Corrections for 2946 gene models were implemented in a new version of the P. pacificus gene annotations. The new set of gene annotations contains 28,896 genes and has a single-copy ortholog completeness level of 97.6%. Conclusions: Our work demonstrates the effectiveness of comparative genomic screens to identify suspicious gene models and the scalability of community-based approaches to improve the quality of thousands of gene models. Similar community-based approaches can help to improve the quality of gene annotations in other invertebrate species, including parasitic nematodes.

    Phylogenetic analysis of antisense gene pairs in nematodes

    This archived folder contains sequences, alignments and trees that were used for the publication "First insights into the nature and evolution of antisense transcription in nematodes" (Figures 3, 4 and 5) by Rödelsperger, Menden, Serobyan, Witte, Baskaran (BMC Evolutionary Biology, 2016). The protein sequences (.fa files) were obtained by aligning C. elegans reference sequences against 19 other nematode genomes with the help of the software exonerate. Multiple sequence alignments (*_alignment.fa) for homologous proteins were generated by the MUSCLE software and maximum-likelihood trees were estimated by the phangorn R package (.nexml files).

    Conserved nuclear hormone receptors controlling a novel plastic trait target fast-evolving genes expressed in a single cell

    Environment shapes development through a phenomenon called developmental plasticity. Deciphering its genetic basis has the potential to shed light on the origin of novel traits and adaptation to environmental change. However, molecular studies are scarce, and little is known about molecular mechanisms associated with plasticity. We investigated the gene regulatory network controlling predatory vs. non-predatory dimorphism in the nematode Pristionchus pacificus and found that it consists of genes of extremely different age classes. We isolated mutants in the conserved nuclear hormone receptor nhr-1 with previously unseen phenotypic effects. They disrupt mouth-form determination and result in animals combining features of both wild-type morphs. In contrast, mutants in another conserved nuclear hormone receptor, nhr-40, display altered morph ratios, but no intermediate morphology. Despite divergent modes of control, NHR-1 and NHR-40 share transcriptional targets, which encode extracellular proteins that have no orthologs in Caenorhabditis elegans and result from lineage-specific expansions. An array of transcriptional reporters revealed co-expression of all tested targets in the same pharyngeal gland cell. Major morphological changes in this gland cell accompanied the evolution of teeth and predation, linking rapid gene turnover with morphological innovations. Thus, the origin of feeding plasticity involved novelty at the level of genes, cells and behavior.